Here I apply the projected neighbors graph visualization to the pancreas dataset used in the scVelo demo and compare it with the visualization of the U2OS dataset.

Setup and get data from scVelo

Use the reticulate package to use scVelo from within R:

Compute velocities on pancreas data using velocyto

Extract spliced and unspliced data

Extract PCA coordinates

Filter genes

Downsample cells to make things easier
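
The downsampling step can be sketched as below. This is a minimal illustration on a toy matrix, not the code used for this analysis: the genes-by-cells layout, the names `counts`, `keep.cells`, and `counts.sub`, and the subset size are all assumptions.

```r
# Toy count matrix: genes in rows, cells in columns (assumed layout).
set.seed(1)
counts <- matrix(rpois(20 * 50, lambda = 2), nrow = 20,
                 dimnames = list(paste0("gene", 1:20), paste0("cell", 1:50)))

# Draw a random subset of cells without replacement.
n.keep <- 10  # hypothetical subset size
keep.cells <- sample(colnames(counts), n.keep)

# Subset by cell name so the same cells can be pulled from every matrix
# (spliced, unspliced, PCA coordinates, ...).
counts.sub <- counts[, keep.cells]
dim(counts.sub)
```

Subsetting by name rather than by position keeps the spliced, unspliced, and embedding matrices aligned even if their column order differs.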

Normalize for dimensional reduction

## Warning in if (!class(counts) %in% c("dgCMatrix", "dgTMatrix")) {: the condition
## has length > 1 and only the first element will be used
## Converting to sparse matrix ...
## Normalizing matrix with 1232 cells and 8724 genes

Dimensional reduction

Run velocyto on panc data

Graph visualization

Scores of observed and projected states in PC space

Graph visualization on subset of cells from PC coordinates

Graph visualization on subset of cells from gene expression
Using common.genes: the intersection of the overdispersed genes (odsGenes) and the genes in the velocity output (genes with high correlation between spliced and unspliced).

Graph parameters

Effects of changing k, distance measure, similarity measure, and similarity threshold:
Using the graph generated from PC coordinates.
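
To make the roles of the four parameters concrete, here is a toy sketch of one plausible way they could interact. It is not the `graphViz()` function called below: `toy_projected_neighbors` is a hypothetical helper, the cells-in-rows layout is an assumption, and the distance and similarity measures are fixed to L2 and cosine for brevity.

```r
# Hypothetical helper, NOT the graphViz() implementation.
# obs, proj: cells x features matrices of observed and projected states.
toy_projected_neighbors <- function(obs, proj, k = 2, threshold = 0.25) {
  vel <- proj - obs                                  # velocity of each cell
  edges <- NULL
  for (i in seq_len(nrow(obs))) {
    others <- setdiff(seq_len(nrow(obs)), i)
    cand <- obs[others, , drop = FALSE]
    # L2 distance from cell i's projected state to each other observed state
    d <- sqrt(rowSums(sweep(cand, 2, proj[i, ])^2))
    # cosine similarity between cell i's velocity and the direction to each candidate
    dirs <- sweep(cand, 2, obs[i, ])
    cs <- as.vector(dirs %*% vel[i, ]) /
      (sqrt(rowSums(dirs^2)) * sqrt(sum(vel[i, ]^2)))
    # keep the k closest candidates, then drop those below the similarity threshold
    nearest <- order(d)[seq_len(min(k, length(others)))]
    keep <- nearest[which(cs[nearest] >= threshold)]
    if (length(keep) > 0) {
      edges <- rbind(edges, cbind(from = i, to = others[keep]))
    }
  }
  edges
}

# Four cells on a line, all moving one unit along PC1.
obs <- cbind(PC1 = 1:4, PC2 = 0)
proj <- obs
proj[, "PC1"] <- proj[, "PC1"] + 1

edges <- toy_projected_neighbors(obs, proj, k = 1, threshold = 0.25)
edges
```

In this toy case the last cell gets no out edge: its closest candidate lies behind it, so the similarity between its velocity and the direction to that candidate falls below the threshold. Raising k adds more candidate edges per cell, and raising the threshold removes more of them.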

L1 vs L2 as distance measure:

# Using k = 10, similarity = cosine, threshold = 0.25
set.seed(1)
graphViz(curr.scores.cellsub,proj.scores.cellsub,10,"L1","cosine",0.25,cell.cols.grph,"L1 distance")

graphViz(curr.scores.cellsub,proj.scores.cellsub,10,"L2","cosine",0.25,cell.cols.grph,"L2 distance")
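
As a reminder of how the two distance measures differ, the following toy example (hypothetical points, base R only) shows that the nearest neighbor of a point can change between L1 and L2.

```r
# Three toy points in a 2-dimensional space.
pts <- rbind(origin = c(0, 0), b = c(3, 4), c = c(6, 0))

# L1 (Manhattan): sum of absolute coordinate differences.
l1 <- as.matrix(dist(pts, method = "manhattan"))["origin", ]
# L2 (Euclidean): square root of the summed squared differences.
l2 <- as.matrix(dist(pts, method = "euclidean"))["origin", ]

l1  # distance from origin: b = 7, c = 6, so c is the closer point
l2  # distance from origin: b = 5, c = 6, so b is the closer point
```

L1 penalizes differences spread over several dimensions more heavily than L2 does, so the set of k nearest projected neighbors can differ between the two.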

Pearson correlation vs Cosine similarity:

# Using k = 10, distance = L2; threshold = 0.25 for cosine, -0.5 for Pearson
set.seed(1)
graphViz(curr.scores.cellsub,proj.scores.cellsub,10,"L2","cosine",0.25,cell.cols.grph,"Cosine Similarity")

graphViz(curr.scores.cellsub,proj.scores.cellsub,10,"L2","pearson",-0.5,cell.cols.grph,"Pearson Correlation")

It looks like Pearson correlation is more conservative than cosine similarity.
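
One possible reason for the difference is that Pearson correlation is the cosine similarity of mean-centered vectors, so any offset shared by both vectors no longer contributes. The toy vectors below are made up for illustration, and `cosine_sim` is a helper defined here, not a function from the analysis.

```r
# Cosine similarity of two numeric vectors.
cosine_sim <- function(x, y) sum(x * y) / sqrt(sum(x^2) * sum(y^2))

x <- c(1, 2, 3, 4)
y <- c(2, 3, 4, 6)

# Pearson correlation equals cosine similarity after centering each vector.
cor(x, y)
cosine_sim(x - mean(x), y - mean(y))

# Two vectors with a large shared offset but opposite fluctuations.
u <- c(5, 6, 5, 6)
v <- c(6, 5, 6, 5)
cosine_sim(u, v)  # close to 1: dominated by the shared offset
cor(u, v)         # -1: the fluctuations around the mean are opposite
```

Because the two measures are on different effective scales, the same numeric threshold is not directly comparable between them.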

Number of out edges k:

Similarity threshold:

## [1] "Done finding neighbors"
## [1] "Done making graph"

## delta projections ... sqrt knn ... transition probs ... done
## calculating arrows ... done
## grid estimates ... grid.sd= 0.09642131  min.arrow.size= 0.001928426  max.grid.arrow.length= 0.0610458  done
## [1] "Done finding neighbors"
## [1] "Done making graph"

## delta projections ... sqrt knn ... transition probs ... done
## calculating arrows ... done
## grid estimates ... grid.sd= 0.0930334  min.arrow.size= 0.001860668  max.grid.arrow.length= 0.0610458  done
## [1] "Done finding neighbors"
## [1] "Done making graph"

## delta projections ... sqrt knn ... transition probs ... done
## calculating arrows ... done
## grid estimates ... grid.sd= 0.1031384  min.arrow.size= 0.002062768  max.grid.arrow.length= 0.0610458  done
## [1] "Done finding neighbors"
## [1] "Done making graph"

## delta projections ... sqrt knn ... transition probs ... done
## calculating arrows ... done
## grid estimates ... grid.sd= 0.1177816  min.arrow.size= 0.002355632  max.grid.arrow.length= 0.0610458  done
## [1] "Done finding neighbors"
## [1] "Done making graph"

## delta projections ... sqrt knn ... transition probs ... done
## calculating arrows ... done
## grid estimates ... grid.sd= 0.08881022  min.arrow.size= 0.001776204  max.grid.arrow.length= 0.0610458  done
## [1] "Done finding neighbors"
## [1] "Done making graph"

## delta projections ... sqrt knn ... transition probs ... done
## calculating arrows ... done
## grid estimates ... grid.sd= 0.09028403  min.arrow.size= 0.001805681  max.grid.arrow.length= 0.0610458  done

Velocity confidence from scvelo

Consistency score

Cell consistency score: the mean correlation between a cell's velocity and the velocities of its nearest neighbors.
First, find the n nearest neighbors for each cell, e.g…

Then calculate the consistency score for each cell.
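
The two steps above can be sketched on random toy data as follows. This is a minimal sketch, not the code used here: the cells-in-rows layout, the choice of Euclidean distance in the embedding for finding neighbors, Pearson correlation as the correlation measure, and the names `emb`, `vel`, `n.nbrs`, `nbrs`, and `consistency` are all assumptions.

```r
# Toy data: cells in rows (assumed layout).
set.seed(1)
n.cells <- 30
emb <- matrix(rnorm(n.cells * 2), ncol = 2)  # embedding coordinates
vel <- matrix(rnorm(n.cells * 5), ncol = 5)  # velocity vectors

n.nbrs <- 5  # hypothetical number of neighbors

# Step 1: n nearest neighbors of each cell by Euclidean distance in the embedding.
d <- as.matrix(dist(emb))
diag(d) <- Inf  # a cell is not its own neighbor
nbrs <- t(apply(d, 1, function(row) order(row)[seq_len(n.nbrs)]))

# Step 2: mean Pearson correlation between each cell's velocity
# and the velocities of its neighbors.
consistency <- sapply(seq_len(n.cells), function(i) {
  mean(sapply(nbrs[i, ], function(j) cor(vel[i, ], vel[j, ])))
})

summary(consistency)
```

With random velocities the scores scatter around zero; an embedding that places cells with similar velocities next to each other should give higher values.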

Cell consistency scores on the embedding (blue = low, red = high)

Graph parameters consistency scores

Number of out edges k:

Similarity threshold:

Consistency of FDG compared to other embeddings

Consistency scores in FDG compared with PCA and UMAP, computed on the same cell subset

## Warning in vattrs[[name]][index] <- value: number of items to replace is not a
## multiple of replacement length

## [1] "Mean consistency scores for PCA, UMAP, FDG"
## [1] 0.4343842
## [1] 0.4314783
## [1] 0.5199737
## [1] "Median consistency scores for PCA, UMAP, FDG"
## [1] 0.4321841
## [1] 0.4309548
## [1] 0.5485402